Abstract 

Little is known about the range and the genetic bases of naturally occurring variation for flavonoids. Using 
Arabidopsis thaliana seed as a model, the flavonoid content of 41 accessions and two recombinant inbred line (RIL) 
sets derived from divergent accessions (Cvi-0×Col-0 and Bay-0×Shahdara) were analysed. These accessions and 
RILs showed mainly quantitative rather than qualitative changes. To dissect the genetic architecture underlying 
these differences, a quantitative trait locus (QTL) analysis was performed on the two segregating populations. 
Twenty-two flavonoid QTLs were detected that accounted for 11-64% of the observed trait variations, only one QTL 
being common to both RIL sets. Sixteen of these QTLs were confirmed and coarsely mapped using heterogeneous 
inbred families (HIFs). Three genes, namely TRANSPARENT TESTA (TT)7, TT15, and MYB12, were proposed to 
underlie their variations since the corresponding mutants and QTLs displayed similar specific flavonoid changes. 
Interestingly, most loci did not co-localize with any gene known to be involved in flavonoid metabolism. This latter 
result shows that novel functions have yet to be characterized and paves the way for their isolation. 

Introduction 

Plants are estimated to contain >200 000 metabolites 
(Dixon and Strack, 2003; Saito and Matsuda, 2010), of 
which ~9000 flavonoids represent a significant proportion 
(Harborne and Williams, 2000; Williams and Grayer, 2004). 
These compounds, with a C6-C3-C6 carbon framework, are 
subdivided into different classes depending on the linkage of 
the aromatic ring to the central C3 moiety and its degree of 
oxidation. The major types of flavonoids are flavonols, 
anthocyanins, and flavan-3-ols [also called condensed tannins 
or proanthocyanidins (PAs)]. 

These metabolites are involved in many physiological 
processes, such as flower or fruit colour (Winkel-Shirley, 
2001), UV protection (Veit and Pauli, 1999; Ryan et al., 
2001), interactions of plants with microbes, animals, or other 
plants (Harborne and Williams, 2000), responses to abiotic stresses 
(Winkel-Shirley, 2002), and auxin transport (Taylor and 
Grotewold, 2005; Peer and Murphy, 2006; Kuhn et al., 
2011). Numerous laboratory and epidemiological studies 
suggest a beneficial effect of flavonoids on human health, 
preventing the occurrence of chronic age-related diseases 
such as cardiovascular diseases or certain cancers (Espin 
et al., 2007; Butelli et al., 2008; Luceri et al., 2008). 
These compounds are also responsible for major organoleptic, 
nutritive, and processing characteristics of feed, 
food, and beverages, and impact many agronomical crop 
traits (Winkel-Shirley, 2001, 2002). For instance, high 
concentrations of astringent PAs can have a negative 
impact on the nutritive value and palatability of forages. In 
contrast, the presence of tannins prevents pasture bloat, which 
can be lethal for ruminants (Lee, 1992; Waghorn and 
McNabb, 2003). 

Being a powerful model for many biological questions, 
flavonoid biosynthesis is one of the best-studied 
metabolic pathways in plants and has been described in 
detail in numerous species. The common precursors are 
malonyl-CoA and p-coumaroyl-CoA, which are condensed to 
chalcone intermediates by chalcone synthase (CHS) (Fig. 1 
and Lepiniec et al., 2006). Among the enzymes that shape 
the nature and content of accumulated flavonoids, flavonoid 
3'-hydroxylase (F3'H) converts dihydrokaempferol into 
dihydroquercetin, flavonol synthase (FLS) catalyses flavonol 
synthesis from dihydroflavonols, and dihydroflavonol 4-reductase 
(DFR) is a common step toward anthocyanidins and 
PAs. Finally, anthocyanidin reductase (ANR) is the first 
committed step to PA synthesis (see Fig. 1). Regulatory 
proteins controlling flavonoid biosynthesis have also been 
characterized, such as the MYB-bHLH-WDR (MBW) 
complex that is involved in biosynthesis of PAs and 
anthocyanins (Baudry et al., 2006; Lepiniec et al., 2006) and 
the R2R3-MYBs PRODUCTION OF FLAVONOL GLYCOSIDE 
(PFG1/MYB12, PFG2/MYB11, and PFG3/MYB111) 
that positively regulate flavonol biosynthesis in 
root and the aerial part (Dubos et al., 2010; Stracke et al., 
2010a, b), whereas the single repeat small MYBs CAPRICE 
(CPC) or MYBL2 can negatively regulate anthocyanin 
synthesis (Dubos et al., 2008; Zhu et al., 2009). 

Arabidopsis is a good model species for the identification 
of genes controlling flavonoid metabolism, because it is 
amenable to both molecular and classical genetic analysis 
(Somerville and Koornneef, 2002; North et al., 2010). 
Several mutants affected in structural or regulatory genes 
have been shown to display typical flavonoid profiles that 
are consistent with the function of these genes or could help 
in their functional characterization (Pourcel et al., 2005; 
Routaboul et al., 2006; Marinova et al., 2007; Dubos et al., 
2008). It is worth noting that most flavonoid genes have 
been identified by genetic approaches based on the isolation 
of mutants that do not accumulate coloured compounds 
such as anthocyanins and oxidized PAs. These previous 
studies essentially revealed qualitative changes controlled by 
a single locus, rather than quantitative variations. The 
functions involved in the metabolism of less coloured 
compounds (such as flavonols), small quantitative variations, 
and/or multigenic effects are more difficult to 
characterize (Tohge et al., 2005; Saito and Matsuda, 2010). 

Fig. 1. The flavonoid biosynthetic pathway in Arabidopsis seed. The different steps leading to the formation of flavonoids in Arabidopsis 
seed are indicated by arrows. Major flavonoids are indicated in bold. Mutants for enzymatic steps are indicated in red lower case italic 
letters. Regulatory proteins are given in parentheses beside their target genes that are shown in upper case letters. ANR, anthocyanidin 
reductase; ANS, anthocyanidin synthase (LDOX, leucoanthocyanidin dioxygenase); CE, condensing enzyme; CHI, chalcone isomerase; 
CHS, chalcone synthase; DFR, dihydroflavonol-4-reductase; F3H, flavanone 3-hydroxylase; F3'H, flavonoid 3'-hydroxylase; FLS, flavonol 
synthase; glc, glucose; GST, glutathione S-transferase; GT, glycosyltransferase; hex, hexose; OMT, methyltransferase; PPO, polyphenol 
oxidase; rha, rhamnose; RT, rhamnosyltransferase; SGT, UDP-glucose:sterol glucosyltransferase. Steps that still need to be 
characterized are indicated in green with a question mark. 

Recently, Arabidopsis flavonoids have been analysed 
using liquid chromatography-mass spectrometry (LC-MS) 
and/or nuclear magnetic resonance (NMR). Briefly, 
anthocyanins and glycosylated kaempferol flavonols are mostly 
found in leaves (Tohge et al., 2005; Yonekura-Sakakibara 
et al., 2008), whereas seeds contain epicatechin, PAs, and 
larger amounts of glycosylated quercetin flavonols (Kerhoas 
et al., 2006; Routaboul et al., 2006). Monoglycosylated 
flavonols and PAs accumulate mainly in the seed coat, 
whereas diglycosylated flavonols are found in the embryo 
(Routaboul et al., 2006). PAs, which are the most abundant 
flavonoids (ahead of anthocyanins) in widely consumed fruits 
such as apples (Zhang et al., 2003; Wojdylo et al., 2008), 
strawberries (Almeida et al., 2007; Buendia et al., 2010), 
and grapes (Mane et al., 2007), and in seeds and grains (Lepiniec 
et al., 2006; Auger et al., 2010), have often been overlooked 
and underestimated, since they are not easily extracted 
(Arranz et al., 2009; Auger et al., 2010). Interestingly, 
Arabidopsis seed contains large amounts of PAs with 
structural characteristics that are similar to those found in 
related crop seeds or fruits (Almeida et al., 2007; Auger 
et al., 2010; Buendia et al., 2010). 

Despite this broad interest, very little is known about the 
range of natural variation of flavonoids and the genetic 
bases of such differences. Genetic analysis of natural 
variation in plants is mainly undertaken by quantitative 
trait locus (QTL) mapping (Trontin et al., 2010), although 
association genetics is coming of age (Atwell et al., 2010). 
In QTL mapping, phenotypic variation is associated with allelic variation at 
molecular markers segregating in mapping populations 
that are derived from crosses between parental lines 
(Alonso-Blanco et al., 2009). Recently, QTLs that govern 
anthocyanidin accumulation and ripening were detected in 
raspberry (Graham et al., 2009; Kassim et al., 2009; 
McCallum et al., 2010), pepper (Chaim et al., 2003), and 
grape (Fournier-Level et al., 2009). QTL analysis of 
flavonoid variation was also performed on apical 
tissues of poplar (Morreel et al., 2006). Regarding seeds, 
isoflavone content in soybean (Gutierrez-Gonzalez et al., 
2010), variation of maysin (a C-glycosyl flavone) in maize 
(Zhang et al., 2003; Meyer et al., 2007), and PA changes in 
beans (Caldas and Blair, 2009) have been investigated. In 
Arabidopsis, Keurentjes and collaborators (2006) 
performed an untargeted LC-MS metabolomic analysis of 
seedlings and also detected a few flavonol QTLs. 

In this study, the metabolite profiling and genetic analysis 
of 41 Arabidopsis accessions and two RIL (recombinant 
inbred line) sets were performed to frame the range of 
natural variation of the major seed flavonoids and 
the genetic architecture underlying these changes. Studying 
the variations for these flavonoids among RILs enabled the 
detection of QTLs that were confirmed and coarsely 
mapped using heterogeneous inbred families (HIFs). The 
metabolite profiles of numerous mutants of candidate genes 
located near these QTLs were analysed. Among them, the 
MYB12 (R2R3 domain transcription factor), TT15 (UDP-glucose:sterol 
glucosyltransferase), and TT7 (F3'H) genes 
not only co-localized with the considered QTLs, but their 
mutants also displayed similar specific phenotypic variation. 
These three genes may thus underlie the variation of the 
studied accessions at these loci. Nevertheless, most of the 
characterized loci could not be associated with any known 
gene involved in flavonoid metabolism, showing that 
combined metabolite profiling and quantitative genetic 
studies can reveal new loci of interest. 



Materials and methods 

Plant materials 

Forty Arabidopsis thaliana accessions from the core collection 
designed by the Biological Resource Centre in Versailles 
(McKhann et al., 2004; http://dbsgap.versailles.inra.fr/vnat/) were 
used to explore species-wide diversity. They were compared with 
the reference Columbia (Col-0) accession. A subset of 100 lines 
from each of the Cvi-0×Col-0 and Bay-0×Shahdara RIL sets 
(obtained from http://dbsgap.versailles.inra.fr/vnat/; Loudet et al., 
2002; Simon et al., 2008), optimized for QTL mapping, was used to 
map QTLs. RILs that still segregated only for a limited region 
around QTLs were used to generate HIFs as previously described 
(Loudet et al., 2005). HIF seeds were also obtained from 
http://dbsgap.versailles.inra.fr/vnat/. HIFs enable the comparison of the 
phenotypic consequences of the two parental alleles at the locus of 
interest in an otherwise identical (but heterogeneous) background. 
tt15-2 (COB16, Ws-4 background) was also obtained from the 
Versailles Biological Resource Centre. All lines were grown in 
a controlled growth chamber in long days with a 16 h photoperiod 
at 170 µE m⁻² s⁻¹ light intensity, and a 21 °C day/18 °C night 
temperature cycle, with constant humidity (65%). The GT72B1 
mutant (Col-0 background) was obtained from Robert Edwards 
(Durham University, UK), the 78B2 and 78B3 glycosyltransferase 
mutants (Col-0 background) from Kazuki Saito (RIKEN Plant 
Science Center, Japan), the PFG1/MYB12, PFG2/MYB11, and 
PFG3/MYB111 single and multiple mutants (Col-0 background) 
from Bernd Weisshaar and Ralph Stracke (Bielefeld University, 
Germany), and the anl2 (Ler background) mutant from Hiriyoshi 
Kubo (Shinshu University, Japan). 

Flavonoid analysis 

Extraction of seed flavonoids was carried out using a modified 
protocol adapted from Routaboul et al. (2006). Seeds of accessions, 
RILs, and HIF lines were produced in three biological repeats. 
For RILs, three representative seed aliquots from the three 
biological repeats were pooled before flavonoid extraction. All 
seed samples were ground for 90 s at maximum speed with 
a FastPrep-24 homogenizer (MP Biomedicals, Solon, USA): lines 
derived from Cvi-0×Col-0 in 1 ml of acetonitrile/water (3/1; v/v), 
and Bay-0×Shahdara lines in 1 ml of methanol/acetone/water/ 
trifluoroacetic acid (30/42/28/0.05; v/v/v/v) to maximize PA 
extraction. A 4 µg aliquot of apigenin was added as an internal standard. 
Following centrifugation, the pellet was extracted further with 1 ml 
of the same solvent mix overnight at 4 °C. The two extracts were 
pooled. The pellet was preserved for insoluble PA analysis. LC-MS 
analyses of individual flavonoids were performed as previously 
described (Kerhoas et al., 2006; Routaboul et al., 2006) using 
a Quattro LC with an ESI Z-Spray interface (MicroMass, 
Manchester, UK), an Alliance 2695 RP-HPLC system, and 
a Waters 2487 UV detector set at 280 nm (Waters, USA). Flavonoid 
contents were expressed relative to quercetin-3-O-rhamnoside, 
rutin, and epicatechin (Extrasynthese, France) external standards 
for monoglycosylated flavonols, diglycosylated flavonols, and 
flavan-3-ols (PAs), respectively. 

PA oligomers and polymers were hydrolysed into coloured 
anthocyanidin and measured at 550 nm using a calibration curve 
made with commercial cyanidin chloride (Extrasynthese, France). 
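
The conversion from absorbance at 550 nm to an amount of anthocyanidin can be sketched as a linear calibration. This is a minimal illustration only: the standard amounts and absorbances below are hypothetical, and the source does not state how the curve was fitted, so ordinary least squares is an assumption.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def absorbance_to_amount(a550, slope, intercept):
    """Invert the calibration line to recover the amount of standard equivalent."""
    return (a550 - intercept) / slope


# Hypothetical cyanidin chloride standards (µg) and their absorbance at 550 nm.
standard_ug = [0.0, 5.0, 10.0, 20.0, 40.0]
standard_a550 = [0.02, 0.12, 0.22, 0.42, 0.82]

slope, intercept = fit_line(standard_ug, standard_a550)
sample_ug = absorbance_to_amount(0.52, slope, intercept)
print(f"slope={slope:.4f}, intercept={intercept:.4f}, sample={sample_ug:.1f} µg")
```

Readings that fall outside the range of the standards would normally be diluted and measured again rather than extrapolated.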

QTL analysis 

QTL analyses were performed using the Unix version of QTL 
CARTOGRAPHER 1.14 (Lander and Botstein, 1989; Basten 
et al., 2000), and standard methods for interval mapping (IM) and 
composite interval mapping (CIM) (Loudet et al., 2003). First, IM 
(Lander and Botstein, 1989) was carried out to determine putative 
QTLs involved in the variation of the trait, and then CIM model 6 
of QTL CARTOGRAPHER was performed on the same data: the 
closest marker to each local LOD score peak (putative QTL) was 
used as a cofactor to control the genetic background while testing 
at another genomic position. When a cofactor was also a flanking 
marker of the tested region, it was excluded from the model. The 
number of cofactors involved in the models varied between one 
and three. The walking speed chosen for QTL analysis was 0.1 cM. 
The global LOD significance threshold (2.3 LOD) was estimated 
from several permutation test analyses, as suggested by Churchill 
and Doerge (1994). QTL co-localization was considered only when 
different QTLs peaked in a window of ≤5 cM (which was chosen a priori 
because it represents a conservative support interval). 
Additive effects ('2a') of detected QTLs were estimated from CIM 
results as representing the mean effect of the replacement of the 
Cvi (or Bay) alleles by Col [or Shahdara (Sha)] alleles at the locus. 
The contribution of each identified QTL to the total phenotypic 
variation (R²) was estimated by variance component analysis, 
using phenotypic values for each RIL. The model used the 
genotype at the closest marker to the corresponding detected QTL 
as random factors in analysis of variance (ANOVA), performed 
using the aov function in R. Only homozygous genotypes were 
included in the ANOVA. 
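
The logic of a LOD score, a permutation-based threshold, and the share of variance explained can be illustrated with a much simpler model than the one used here. The sketch below is an assumption-laden stand-in: it performs single-marker regression on homozygous genotype classes, not the IM/CIM procedures of QTL CARTOGRAPHER, and it computes R² as the plain between-genotype share of the sum of squares rather than by variance component analysis. All data and function names are hypothetical.

```python
import math
import random


def marker_lod(phenotypes, genotypes):
    """Return (LOD, R^2) for one marker scored as homozygous genotype classes."""
    n = len(phenotypes)
    grand_mean = sum(phenotypes) / n
    rss_null = sum((y - grand_mean) ** 2 for y in phenotypes)
    rss_marker = 0.0
    for allele in set(genotypes):
        group = [y for y, g in zip(phenotypes, genotypes) if g == allele]
        group_mean = sum(group) / len(group)
        rss_marker += sum((y - group_mean) ** 2 for y in group)
    lod = (n / 2) * math.log10(rss_null / rss_marker)
    r_squared = 1 - rss_marker / rss_null
    return lod, r_squared


def permutation_threshold(phenotypes, markers, n_perm=1000, alpha=0.05, seed=0):
    """Genome-wide LOD threshold in the spirit of Churchill and Doerge (1994):
    shuffle phenotypes, record the maximum LOD over all markers, and return
    the (1 - alpha) quantile of those maxima."""
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    max_lods = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        max_lods.append(max(marker_lod(shuffled, g)[0] for g in markers))
    max_lods.sort()
    return max_lods[round((1 - alpha) * n_perm) - 1]


# Hypothetical toy population: 8 lines, 2 markers ('A' and 'B' parental alleles).
phenotypes = [1.0, 1.2, 0.9, 1.1, 2.0, 2.1, 1.9, 2.2]
markers = ["AAAABBBB", "ABABABAB"]

lod_linked, r2_linked = marker_lod(phenotypes, markers[0])
lod_unlinked, _ = marker_lod(phenotypes, markers[1])
threshold = permutation_threshold(phenotypes, markers, n_perm=200)
print(f"linked LOD={lod_linked:.2f} (R2={r2_linked:.2f}), "
      f"unlinked LOD={lod_unlinked:.2f}, threshold={threshold:.2f}")
```

With only eight lines the permutation threshold is not informative; the point is the procedure, which scales to the ~100 lines per RIL set used in the study.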

Hierarchical clustering was performed using Genesis 1.7.5 
(Institute for Genomics and Bioinformatics, Graz University of 
Technology, http://genome.tugraz.at). Distances were calculated 
using complete linkage clustering and Pearson correlations. 
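
The clustering scheme named above (Pearson correlation as the similarity measure, complete linkage as the merging rule) can be sketched in plain Python. This is not the Genesis implementation; the accession names and profiles are hypothetical, and using 1 - r as the distance is an assumption.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def complete_linkage(profiles, n_clusters):
    """Agglomerative clustering with distance 1 - r and complete linkage.

    profiles maps a name to a list of values; returns a list of sets of names.
    """
    names = list(profiles)
    dist = {
        frozenset((a, b)): 1 - pearson(profiles[a], profiles[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }

    def cluster_distance(c1, c2):
        # Complete linkage: distance between the two most dissimilar members.
        return max(dist[frozenset((a, b))] for a in c1 for b in c2)

    clusters = [{name} for name in names]
    while len(clusters) > n_clusters:
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda pair: cluster_distance(clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[i] |= clusters[j]
        del clusters[j]
    return clusters


# Hypothetical log2-scaled flavonoid profiles for four made-up accessions.
profiles = {
    "acc1": [1.0, 2.0, 3.0],
    "acc2": [1.1, 2.1, 2.9],
    "acc3": [3.0, 2.0, 1.0],
    "acc4": [2.9, 2.2, 1.1],
}
clusters = complete_linkage(profiles, n_clusters=2)
print(sorted(sorted(c) for c in clusters))
```

A profile with no variation would make the correlation undefined (division by zero), so constant traits need to be filtered out beforehand.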

The flavonoid contents of selected lines of the two RIL sets are 
given in Supplementary Tables S5 and S6 available at JXB online 
allowing calculation of minor QTLs that have not been presented 
in the Results section. 



Results 

Comparative seed flavonoid analysis between 
Arabidopsis accessions shows quantitative rather than 
qualitative differences 

The seed flavonoids accumulated in three widely used 
accessions, namely Col-0, Ws-4, and Ler, have been 
characterized previously (Kerhoas et al., 2006; Routaboul 
et al., 2006). Here this analysis was extended, selecting 40 
novel accessions from the Versailles core collection 
(McKhann et al., 2004), defined to maximize genetic 
diversity among 265 accessions distributed worldwide 
(Fig. 2A). Mature seed extracts were analysed using LC-MS 
to quantify individually the different flavonols. In 
addition, PA contents were assessed using acid-catalysed 
hydrolysis (Porter et al., 1986) both on the extract (hereafter 
called soluble PAs) and on the remaining pellet (hereafter 
called insoluble PAs). The first observation was that, among 
the accessions tested, essentially quantitative rather than 
qualitative variations were observed (Fig. 2; Supplementary 
Fig. S1 and Supplementary Table S1 at JXB online). All the 
flavonoids previously characterized in Col-0, Ws-4, or Ler 
were found in all the accessions. Five accessions illustrated 
the range of changes observed, namely Col-0, Cvi-0, Nok-1, 
Sp-0, and Sha (Fig. 2A; Supplementary Fig. S1). Col-0 
contained the lowest levels of kaempferol derivatives and was clearly 
different from other accessions. Cvi-0 had low quercetin 
3-O-rhamnoside content and, consequently, low levels of 
the derived biflavonols (Pourcel et al., 2005), whereas PAs 
were 3-fold higher. Interestingly, these three compounds are 
mainly accumulated in the seed coat (Routaboul et al., 
2006). Flavonols and PAs accumulated to the highest levels 
in the Nok-1 accession, in which flavonoids account for 
1.7% of dry weight (DW). Sp-0 had the highest quercetin 
3-O-rhamnoside content with a concomitant increase in 
PAs. It should be noted that the largest variations in 
flavonoids were obtained for some seed coat-specific 
flavonols, such as quercetin 3-O-rhamnoside (from 0.008% in 
Cvi-0 up to 0.6% of DW in Sp-0), for PAs (from 0.2% in 
Sav-0 up to 0.9% DW in Gre-0), and for kaempferol derivatives 
(from 0.04% in Col-0 to 0.3% DW in Nok-1). 

Fig. 2. Natural variation of seed flavonoids in Arabidopsis. (A) 
Hierarchical clustering analysis of mature seed flavonoid accumulation 
in 40 accessions compared with Col-0. Log2 % of Col-0 from 
three (*or two) independent measurements ±SE. Flavonoid contents 
for each accession and correlation between the different 
compounds are given in Supplementary Tables S1 and S2, 
respectively, at JXB online. (B) Boxplot analysis of flavonoid 
content in accessions giving the minimum, lower quartile, median, 
upper quartile and outlier, respectively, from bottom to top. G, 
glucoside; H, hexoside; I, isorhamnetin; insol., insoluble; K, 
kaempferol; Q, quercetin; PA, proanthocyanidin; R, rhamnoside; 
sol., soluble. 
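
To make the size of these ranges easier to compare, the extremes quoted above can be turned into fold changes. The sketch below uses only the percentages given in the text; the helper name is illustrative.

```python
import math

# (lowest, highest) content in % of seed dry weight, as quoted in the text.
RANGES = {
    "quercetin 3-O-rhamnoside": (0.008, 0.6),
    "PAs": (0.2, 0.9),
    "kaempferol derivatives": (0.04, 0.3),
}


def fold_range(lowest, highest):
    """Ratio between the highest and lowest accession for one compound class."""
    return highest / lowest


folds = {name: fold_range(low, high) for name, (low, high) in RANGES.items()}
for name, fold in folds.items():
    print(f"{name}: {fold:.1f}-fold (log2 = {math.log2(fold):.2f})")
```

On these figures, quercetin 3-O-rhamnoside spans roughly 75-fold across accessions, against about 4.5-fold for PAs and 7.5-fold for kaempferol derivatives.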

Analysis of the Sha accession uncovers a new biosynthetic 
step 

The Shahdara accession contained three novel 
flavonol-hexoside-rhamnoside derivatives. They possessed the same 
glycosylations but a different quercetin, kaempferol, or 
isorhamnetin aglycone ([M+H]+ = 611, 595, and 625; 
[M+H-hexose]+ = 449, 433, and 463; and 
[M+H-hexose-rhamnose]+ = 303, 287, and 317, respectively). These 
compounds eluted ~1 min before the 
corresponding aglycone-3-O-glucoside-7-O-rhamnoside 
isomers and are thus different from these previously 
characterized flavonols (Kerhoas et al., 2006; Supplementary 
Fig. S5 at JXB online). Nevertheless, Sha was also 
able to synthesize all the flavonols detected in Bay-0. This 
result suggested that a novel and specific glycosyltransferase 
that catalyses the production of flavonol-hexoside-rhamnoside 
is active in Shahdara but not in Bay-0. 
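
The fragment masses reported for the three Shahdara-specific compounds follow from the nominal neutral losses of a hexose (162) and a rhamnose (146) residue. The short check below rebuilds each ion series from the aglycone ion; the function and constant names are illustrative, and nominal (integer) masses are used throughout.

```python
HEXOSE_LOSS = 162    # nominal neutral loss of a hexose residue
RHAMNOSE_LOSS = 146  # nominal neutral loss of a rhamnose (deoxyhexose) residue

# Nominal [M+H]+ of the aglycone ions reported in the text.
AGLYCONE_ION = {"quercetin": 303, "kaempferol": 287, "isorhamnetin": 317}


def fragment_ladder(aglycone):
    """Return ([M+H]+, [M+H-hexose]+, [M+H-hexose-rhamnose]+) for a
    flavonol-hexoside-rhamnoside built on the given aglycone."""
    aglycone_ion = AGLYCONE_ION[aglycone]
    after_hexose_loss = aglycone_ion + RHAMNOSE_LOSS
    parent_ion = after_hexose_loss + HEXOSE_LOSS
    return parent_ion, after_hexose_loss, aglycone_ion


for name in AGLYCONE_ION:
    print(name, fragment_ladder(name))
```

The same ladder applies to the 3-O-glucoside-7-O-rhamnoside isomers, which is why retention time rather than mass is what distinguishes the new compounds.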

Relationships between the contents of different 
flavonoids in mature seeds 

One could expect to observe some correlations between the 
accumulations of different flavonoids that belong to various 
subpathways or represent related quantitative traits. 
Alternatively, the lack of correlation may reveal regulatory steps 
for which specific QTLs should be detected. These 
correlations were measured and are depicted as a tree (Fig. 2A; 
Supplementary Tables S2-S4 at JXB online). Some of 
these statistically significant correlations could be foreseen, 
such as the one between the accumulation of the precursor 
quercetin-3-O-rhamnoside and its derived biflavonol 
products (Pourcel et al., 2005) (r=0.71, P < 0.0001) or between 
soluble and insoluble PAs (r=0.84, P < 0.0001). This also 
shows that some accessions such as Cvi-0 or Bur-0 display 
a more contrasted quercetin-3-O-rhamnoside/biflavonol 
ratio relative to that of Sp-0 or Edi-0. However, the 
correlation between kaempferols and PAs was unexpected 
(r=0.52, P < 0.0001 and r=0.33, P=0.048 between soluble 
PA and kaempferol-3,7-di-rhamnoside or 
kaempferol-3-O-glucoside-7-O-rhamnoside, respectively). 
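
One way to see how firmly each of these coefficients is established is an approximate confidence interval from the Fisher z-transformation. This is an illustrative calculation, not the test used in the study: the sample size of 41 accessions is an assumption (individual compounds may have been measured in fewer lines), and the normal approximation is only rough at this sample size.

```python
import math
from statistics import NormalDist


def pearson_confidence_interval(r, n, level=0.95):
    """Approximate confidence interval for a Pearson r (Fisher z-transform)."""
    z = math.atanh(r)
    standard_error = 1 / math.sqrt(n - 3)
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    return (math.tanh(z - z_crit * standard_error),
            math.tanh(z + z_crit * standard_error))


N_ACCESSIONS = 41  # assumption: all accessions contribute to every correlation

intervals = {r: pearson_confidence_interval(r, N_ACCESSIONS)
             for r in (0.71, 0.84, 0.52, 0.33)}
for r, (low, high) in intervals.items():
    print(f"r={r:.2f}: 95% CI approx. [{low:.2f}, {high:.2f}]")
```

Under these assumptions the interval for r=0.33 only just excludes zero, in line with its borderline P value, whereas the intervals for the other three correlations sit well above zero.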

A clustering of accessions based on their flavonoid profile 
was carried out (Fig. 2A). Five major groups could be 
distinguished. In Arabidopsis, it is often difficult to associate 
specific genotypes with geographic origin (Anastasio et al., 
2011) since human activities tend to homogenize variation 
among populations, especially in Europe and North America, 
and recolonization events from circum-Mediterranean glacial 
refugia have also been proposed to occur (Mitchell-Olds and 
Schmitt, 2006). The accessions of cluster 1 contained a higher 
level of phenotypic variation than other clusters (these 
accessions could be considered as separate clusters) and were 
characterized by less kaempferol-3-O-rhamnoside. Cluster 2 
represented western European accessions (Fig. 3) that 
accumulated lower levels of quercetin derivatives and more PAs and 
kaempferol (such as Cvi-0; see Supplementary Fig. S1 at JXB 
online). Cluster 3 contained a single accession from The 
Netherlands, Nok-1, that stood apart from the other 
accessions due to its overall high levels of flavonoids. 
Accessions of cluster 4 included many central European 
accessions that had, on average, more quercetin derivatives 
but less PAs (such as Sp-0), whereas cluster 5 is the only 
group containing Asian and North American accessions, which 
appeared to contain more PAs (such as Shahdara). 

Fig. 3. Geographical distribution of studied accessions (dark blue, light blue, pink, green, and yellow dots correspond to clusters 1-5 of 
Fig. 2, respectively). 

Extended flavonoid variation in two recombinant inbred 
line sets 

To dissect these natural variations genetically, selected 
progeny of two RIL populations, Cvi-0×Col-0 and 
Bay-0×Shahdara (Loudet et al., 2002; Simon et al., 2008), 
were analysed. Correlations between the different 
flavonoid contents in the Cvi-0×Col-0 RIL set were similar to 
those observed among the accessions (Fig. 4A; Supplementary 
Table S3 at JXB online). In contrast, correlations 
were generally weaker or no longer existent in the 
Bay-0×Shahdara RIL population, such as that between 
quercetin-3-O-rhamnoside and one of its products, the 
biflavonols (r=-0.07, P > 0.5), between soluble and 
insoluble PAs (r=0.48, P < 0.0001), or between PAs and 
kaempferols (Fig. 4A, B; Supplementary Table S3). 
Variations in flavonoid content in both RIL populations 
are presented in Fig. 4C and D. The two RIL populations 
showed transgressive segregation from their parents for 
most traits, especially for diglycosylated quercetins and 
isorhamnetin derivatives that displayed small differences 
between the parents. This suggests that all four 
parents have positive-effect alleles for these compounds 
and that numerous QTLs are likely to be detected. 
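
Transgressive segregation means that some RILs fall outside the range spanned by their two parents. A minimal sketch of that check follows; the parental values are the Bay-0 and Shahdara means for isorhamnetin-rhamnoside from Table 1 (0.07 and 0.08 mg g⁻¹ seed), while the RIL names and values are hypothetical. A real analysis would also account for measurement error around the parental means.

```python
def transgressive_lines(ril_values, parent_a, parent_b):
    """Split RILs into those below the lower parent and above the higher one."""
    low, high = sorted((parent_a, parent_b))
    below = [name for name, value in ril_values.items() if value < low]
    above = [name for name, value in ril_values.items() if value > high]
    return below, above


BAY_0, SHAHDARA = 0.07, 0.08  # parental means, mg per g seed (Table 1)

# Hypothetical RIL measurements, for illustration only.
ril_values = {"RIL_a": 0.03, "RIL_b": 0.075, "RIL_c": 0.12, "RIL_d": 0.10}

below, above = transgressive_lines(ril_values, BAY_0, SHAHDARA)
print("below both parents:", below)
print("above both parents:", above)
```

When parents differ little but the progeny spread widely in both directions, each parent must carry alleles of both increasing and decreasing effect, which is why many QTLs are expected for such traits.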

Developing seeds from the four parental accessions were 
also analysed (Fig. 5) to uncover additional compounds 
that are not detected in mature seed. All four accessions 
contained a novel diglycosylated quercetin, namely 
quercetin-rhamnoside-glucoside, which differs from the two 
quercetin-glucoside-rhamnosides described above 
(quercetin-3-O-glucoside-7-O-rhamnoside and 
quercetin-hexoside-rhamnoside from Shahdara). The accumulation of this new 
compound could be associated with that of quercetin 
3-O-rhamnoside since both compounds were lower in Cvi-0 and 
Shahdara compared with Col-0 and Bay-0. Additionally, 
quercetin-hexoside-rhamnoside 2, detected only in 
Shahdara, accumulated steadily during seed development but at 
very low levels (Fig. 5F). 

Fig. 4. Natural variation among recombinant inbred lines (RILs) derived from the Cvi-0×Col-0 and Bay-0×Shahdara crosses. Relationships 
between mature seed flavonoid contents in two RIL populations Cvi-0×Col-0 (A) and Bay-0×Shahdara (B). Log2 % of Col-0 or Bay-0 and 
boxplot analysis for each flavonoid giving the minimum, lower quartile, median, upper quartile, and outlier, from the bottom to the top (C and 
D). G, glucoside; H, hexoside; I, isorhamnetin; insol., insoluble; K, kaempferol; Q, quercetin; PA, proanthocyanidin; R, rhamnoside; sol., soluble. 

Fig. 5. Flavonoid content in developing seed from RIL parental lines: Col-0 (dark) and Cvi-0 (white) (A-D), Bay-0 (dark) and Shahdara 
(Sha, white) (E-H). Values represent the data obtained from one representative experiment (among experiments). All individual 
compounds measured with LC-MS. EC, epicatechin; G, glucoside; H, hexoside; I, isorhamnetin; K, kaempferol; Q, quercetin; B2, 
epicatechin dimer; R, rhamnoside. 

QTL analysis uncovers 22 flavonoid QTLs, of which only 
one is common to the two populations 

A total of 22 significant QTLs involved in flavonoid 
variation (termed 'FLA') were detected in the two RIL 
populations. The chromosome location of each QTL is 
presented in Table 1 together with its significance (LOD 
score), additive effects (2a), and the percentage of total 
variance explained for the given flavonoid (R²). These QTLs 
represent from 11% to 61% of the flavonoid variation. Most 
QTLs were detected in only one of the two mapping 
populations. Nevertheless, one locus involved in kaempferol 
changes could be common to both RIL populations (FLA5/ 
FLA15, Table 1, located at ~3 Mb on chromosome 5). The 
co-localization of QTLs for quercetin 3-O-rhamnoside and 
biflavonols (FLA1/FLA3 and FLA12/FLA14), PAs and 
quercetin 3-O-rhamnoside (FLA2/FLA9 and FLA11/ 
FLA21), and PAs (FLA20/FLA22), as well as the direction 
of their predicted allelic effects, are consistent with the 
phenotypic correlations observed between some flavonoids 
in parental accessions and RIL populations. 

Table 1. Mapped QTLs that account for variation in flavonoid accumulation in mature seed of two RIL sets: Cvi-0×Col-0 and 
Bay-0×Shahdara 

Flavonoid contents of parental lines are means of four independent experiments ±SE. Chr, chromosome; the position in centiMorgans (cM) is 
that from the first marker on the chromosome; 2a, the additive effect represents the mean effect in mg g⁻¹ seed of the replacement of Cvi-0 (or 
Shahdara, Sha) alleles by Col-0 (or Bay-0) alleles at a given QTL; R², percentage of the total phenotypic variance for a given flavonoid explained 
by the QTL. G, glucoside; H, hexoside; I, isorhamnetin; insol., insoluble; K, kaempferol; Q, quercetin; PA, proanthocyanidin; R, rhamnoside; 
sol., soluble. Cells marked [illegible] could not be recovered from the source. 

Cvi-0×Col-0 
Flavonoid     Col-0 (mg g⁻¹ seed)  Cvi-0 (mg g⁻¹ seed)  QTL name  Chr  Marker    Position [cM (Mb)]  LOD   2a     R² (%) 
Q-3-O-R       3.24±0.15            0.29±0.02            FLA1      1    c1_26993  121.7 (28)          4.22  0.64   25 
                                                        FLA2      2    c2_17606  84.7 (18.5)         2.76  0.50   16 
Biflavonols   0.39±0.06            0.07±0.01            FLA3      1    c1_26993  122.7 (27)          2.90  0.01   16 
K-3,7-di-O-R  0.18±0.04            0.70±0.08            FLA4      4    c4_06923  39.4 (7.5)          2.49  -0.20  13 
                                                        FLA5      5    c5_02900  12.9 (3)            2.91  -0.22  17 
                                                        FLA6      5    c5_07442  34.0 (8)            2.89  -0.24  18 
PA soluble    2.73±0.50            7.04±0.57            FLA7      2    c2_11457  49.0 (11.5)         2.74  -1.80  15 
                                                        FLA8      5    c5_05319  24.9 (6)            2.59  -1.94  17 
PA insoluble  7.33±0.90            13.68±0.51           FLA9      2    c2_17606  87.0 (19)           3.13  -2.40  17 

Bay-0×Shahdara 
Flavonoid     Bay-0 (mg g⁻¹ seed)  Shahdara (mg g⁻¹ seed)  QTL name  Chr          Marker        Position [cM (Mb)]  LOD          2a           R² (%) 
Q-3-O-R       5.59±0.28            3.94±0.25               FLA10     1            MSAT1.5       69.4 (23)           2.34         -0.70        11 
                                                           FLA11     [illegible]  [illegible]   [illegible]         [illegible]  [illegible]  14 
                                                           FLA12     5            MSAT520037    69.4 (21)           5.31         1.04         25 
Q-H-R 2       0.00±0.00            0.63±0.14               FLA13     5            NGA151        21.9 (7)            14.24        -0.56        61 
Biflavonols   0.44±0.01            0.61±0.06               FLA14     5            MSAT518662    65.2 (19.5)         14.07        -0.22        53 
K-3-O-R       0.10±0.01            0.17±0.01               FLA15     5            NGA249        12.1 (3)            9.87         -0.06        40 
I-R           0.07±0.01            0.08±0.01               FLA16     3            MSAT305754    14.4 (7.5)          3.23         -0.02        19 
                                                           FLA17     4            MSAT4.15      33.5 (9.4)          2.94         0.02         14 
PA soluble    2.87±0.22            3.45±0.18               FLA18     1            dCAPsAPR2     55.4 (18.5)         2.39         -1.02        13 
                                                           FLA19     4            MSAT4.39      1.0 (0.25)          5.41         -1.44        24 
                                                           FLA20     5            JV7576        79.0 (24)           3.35         1.10         15 
PA insoluble  1.74±0.11            1.34±0.07               FLA21     4            MSAT4.9       56.8 (16)           3.25         0.22         16 
                                                           FLA22     5            JV6162        74.9 (23)           2.96         0.22         14 
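
The co-localization rule from the Methods (peaks within a window of 5 cM, here read as at most 5 cM apart) can be applied mechanically to the peak positions listed in Table 1. The sketch below compares QTLs only within the same population and chromosome, because centiMorgan positions from different genetic maps are not directly comparable. FLA11 is left out because its position is not legible in this copy of the table; the function name is illustrative.

```python
from itertools import combinations

# (population, QTL, chromosome, peak position in cM), as read from Table 1.
QTL_PEAKS = [
    ("Cvi-0xCol-0", "FLA1", 1, 121.7), ("Cvi-0xCol-0", "FLA2", 2, 84.7),
    ("Cvi-0xCol-0", "FLA3", 1, 122.7), ("Cvi-0xCol-0", "FLA4", 4, 39.4),
    ("Cvi-0xCol-0", "FLA5", 5, 12.9), ("Cvi-0xCol-0", "FLA6", 5, 34.0),
    ("Cvi-0xCol-0", "FLA7", 2, 49.0), ("Cvi-0xCol-0", "FLA8", 5, 24.9),
    ("Cvi-0xCol-0", "FLA9", 2, 87.0),
    ("Bay-0xSha", "FLA10", 1, 69.4), ("Bay-0xSha", "FLA12", 5, 69.4),
    ("Bay-0xSha", "FLA13", 5, 21.9), ("Bay-0xSha", "FLA14", 5, 65.2),
    ("Bay-0xSha", "FLA15", 5, 12.1), ("Bay-0xSha", "FLA16", 3, 14.4),
    ("Bay-0xSha", "FLA17", 4, 33.5), ("Bay-0xSha", "FLA18", 1, 55.4),
    ("Bay-0xSha", "FLA19", 4, 1.0), ("Bay-0xSha", "FLA20", 5, 79.0),
    ("Bay-0xSha", "FLA21", 4, 56.8), ("Bay-0xSha", "FLA22", 5, 74.9),
]


def colocalized_pairs(peaks, window_cm=5.0):
    """Pairs of QTLs from the same population and chromosome whose peaks
    lie within window_cm of each other."""
    pairs = []
    for (pop1, q1, chr1, pos1), (pop2, q2, chr2, pos2) in combinations(peaks, 2):
        if pop1 == pop2 and chr1 == chr2 and abs(pos1 - pos2) <= window_cm:
            pairs.append((q1, q2))
    return pairs


pairs = colocalized_pairs(QTL_PEAKS)
print(pairs)
```

The FLA5/FLA15 pair does not appear in this output because the two QTLs come from different populations; their correspondence rests on the shared physical position (~3 Mb on chromosome 5) rather than on genetic distance.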

Sixteen FLA loci are confirmed using HIF lines 

HIFs, generated from the residual heterozygosity still 
segregating in some F6 RILs (Loudet et al., 2005), were 
used for further characterization (mapping and analysis) of 
the QTLs. Each HIF contains a short region fixed for one 
or other parental allele in an otherwise identical genetic 
background. From the 22 QTLs characterized using the two 
RIL populations, 16 were confirmed in HIFs that showed 
the expected variation (for both the direction and amplitude 
of the variations) (Table 2; Supplementary Figs S2, S3 at 
JXB online). FLA1, 3, 5, 11, 13, 15, 19, and 21 were 
validated with at least two independent HIF lines. 
Metabolite changes within the HIFs provided additional 
information about the flavonoid phenotypes and, in several cases, 
explained the occurrence of suggestive loci (1 < LOD < 2.5) 
detected with the RILs. This validates the quality of the 
data and the conservative nature of the QTL thresholds. 
The flavonoid contents of selected lines of the two RIL sets 
are given in Supplementary Tables S5 and S6. 
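
A HIF comparison boils down to contrasting lines fixed for one parental allele with sister lines fixed for the other. The sketch below computes a percentage change and a Welch t statistic for such a pair. All measurements are hypothetical, and two choices are assumptions rather than the authors' stated procedure: the change is expressed relative to the alternative (Cvi or Sha) allele, and the unequal-variance form of the t statistic is used.

```python
import math
from statistics import mean, variance


def hif_allele_effect(reference_allele_values, alternative_allele_values):
    """Return (percent change, Welch t statistic) for the reference (Col or Bay)
    allele relative to the alternative allele within one HIF."""
    mean_ref = mean(reference_allele_values)
    mean_alt = mean(alternative_allele_values)
    percent_change = 100 * (mean_ref - mean_alt) / mean_alt
    standard_error = math.sqrt(
        variance(reference_allele_values) / len(reference_allele_values)
        + variance(alternative_allele_values) / len(alternative_allele_values)
    )
    return percent_change, (mean_ref - mean_alt) / standard_error


# Hypothetical flavonoid contents (mg per g seed) from three biological repeats.
col_allele = [1.4, 1.5, 1.3]
cvi_allele = [1.0, 1.1, 0.9]

percent_change, t_statistic = hif_allele_effect(col_allele, cvi_allele)
print(f"change = {percent_change:+.0f}%, t = {t_statistic:.2f}")
```

A positive percentage here corresponds to the Col or Bay allele increasing the trait, matching the sign convention described for Table 2.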

Flavonoid analysis of the myb12 mutant provides 
a candidate gene for the FLA2 locus and also shows 
that MYB12 controls flavonol accumulation in the seed 
coat 

PFG1/MYB12, PFG2/MYB11, and PFG3/MYB111 
transcriptionally control flavonol biosynthesis in root and aerial 
parts (Dubos et al., 2010; Stracke et al., 2010a, b), whereas 
the single repeat R3 MYB CPC can negatively regulate 
anthocyanin synthesis (Zhu et al., 2009). MYB12 and CPC 
genes co-localize with the FLA2 locus involved in variation 
of quercetin-3-O-rhamnoside content (Table 2; Supplementary 
Fig. S2C at JXB online). The cpc and myb12 mutants 
as well as single and multiple pfg mutants were analysed 
(Fig. 6; Supplementary Fig. S4). The cpc mutant (in Ws-4) 
did not show any significant flavonol change and is thus less 
likely to control the FLA2 QTL. The two myb12 mutant 
alleles (and multiple mutant combinations with myb12) in 
the Col-0 background were mainly affected in 
quercetin-3-O-rhamnoside and biflavonol accumulation. These two 



Natural variation for flavonoids in Arabidopsis \ 3757 



Table 2. Confirmation of the major QTLs detected in Cvi-OxCol-0 and Bay-OxShahdara by analysis of the phenotypes segregating in 
diverse heterogeneous inbred families (HIFs) 

The HIF name indicates the corresponding recombinant inbred lines from the Cvi-OxCol-0 (SHV) or Bay-OxShahdara (SSHV) sets showing 
residual heterozygosity in the region of the QTLs. The position of segregating markers as well as some potential candidate genes included in 
this interval are indicated. Values for each trait indicate the change (%) in trait value when comparing the two alleles fixed in the segregating 
region for each HIF. Positive (versus negative) values indicate that Col or Bay allele is increasing (decreasing) the trait relative to the alternative 
allele; numbers in bold show significant changes between HIF alleles, and grey areas indicate when a QTL was detected (see Table 1). The 
same colour shows the flavonoid change corresponding to a given locus. Significance in f-test at the *5%, **1 %, and *** 0.1 % level. G, 
glucoside; h, hexoside; I, isorhamnetin; insol., insoluble; K, kaempferol; 0, quercetin; PA, proanthocyanidin; R, rhamnoside; sol., soluble. 
Additional information is given in Supplementary Figs S2 and S3 at JXB online. 
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8HV215 
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FIJ\1, 
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39* 


8HV258 




FIJ\3 
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8HV344 
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FIJ\7 


10 250 




8HV218 
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8HV41 1 
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FIJ\2 
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40 
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MYB12 (19.5) 
F3'H (2.6) 

TT15 (1.6) 



DFR (17.2), 
TT10 (19.5), 
CHI (20.4) 

F'3H (2.6), 

UGT 78D3 and 2 (5.6) 



compounds are essentially accumulated in the seed coat and 
were also controlled by the FLA2 QTL (the QTL for 
biflavonol was only marginally suggestive with a LOD of 1.1). 
In addition, mybll and myblll mutants and double or 
triple mutant with mybll and myblll alleles had lower 
diglycosylated tlavonol contents, that were essentially accu- 
mulated in the embryo. Interestingly, the triple mutant 
contained more soluble PAs (as previously observed in the 
flsl mutant; Routaboul et al, 2006). This specific pattern of 
accumulation is consistent with a role for these closely 
related R2R3-MYBs in the control of flavonol accumula- 
tion through the early biosynthesis genes, in distinct parts of 
the seed, as previously observed in seedlings (Stracke et al, 
2007, 2010fl, b; Dubos et al, 2010). The changes observed in 
the myhl2 mutant suggested that this MYB12 gene is 
a strong candidate for FLA2. However, the HIF at the 
FLA2 locus also showed modifications of diglycosylated 



flavonols (see Supplementary Fig. S2C), and suggestive 
QTLs (1 < LOD < 2) for these compounds were also 
detected. These results may thus reveal an additional QTL 
at the end of chromosome 2. Alternatively, the genetic 
modification at the FLA2 QTL could be more complex than 
a simple loss of function of the mybll gene or the genetic 
background of the RILs/HIFs could modify its output 
through epistasis. 

Neither ANL2 nor the 72B1 glycosyltransferase is involved
in PA accumulation

ANTHOCYANINLESS2 (ANL2) is a homeobox gene that
affects anthocyanidin distribution in vegetative tissues
(Kubo et al., 1999). GT72B1 is a glycosyltransferase that
is the most closely related gene to UGT72L1, which is
involved in epicatechin-3'-glucoside synthesis in Medicago
(Pang et al., 2008). Both genes co-localized with the FLA19
QTL (Table 2; Supplementary Fig. S3B at JXB online).
Nevertheless, neither anl2 (Ler) nor gt72b1 (Col-0) mutants
showed significant variation in seed PAs, suggesting that
this variation cannot be explained by a loss-of-function
allele of any of these genes in Shahdara.

Fig. 6. Flavonoid analysis of mutants for genes located near the FLA loci. G, glucoside; H, hexoside; I, isorhamnetin; K, kaempferol;
Q, quercetin; PA, proanthocyanidin; R, rhamnoside; sol., soluble. Significance in t-test compared with the wild type at the *5% and
***0.1% level.

78D2 glycosyltransferase are implicated in seed flavonol
glucosylation

A cluster of three highly homologous glycosyltransferases,
namely 78D2, 78D3, and At5g17040, that could be involved
in the accumulation of a new flavonol-hexoside-rhamnoside
found in Shahdara (Supplementary Fig. S5 at JXB online) is
located in the region of the FLA13 locus (Supplementary
Fig. S3D). Interestingly, 78D2 has been shown to be involved
in anthocyanidin and flavonol glucosylation in leaves (Tohge
et al., 2005; Kubo et al., 2007) and 78D3 is a flavonol
arabinosyltransferase in leaves (Yonekura-Sakakibara et al.,
2008), whereas the At5g17040 product has not yet been
functionally characterized. Unfortunately, neither wild-type
Col-0 nor the corresponding Col-0 mutants accumulate the
additional quercetin derivative, so their involvement could
not be tested (Fig. 6).

However, the 78D2 mutant still contained isorhamnetin
3-O-glucoside-7-O-rhamnoside, whereas kaempferol and
quercetin 3-O-glucoside-7-O-rhamnoside were absent. This showed
that the 78D2 flavonol-3-glycosyltransferase solely catalyses
the addition of a glucose moiety on the kaempferol and
quercetin aglycones but not on isorhamnetin. This also
means that another, still unknown, glycosyltransferase
transfers a glucose onto isorhamnetin. Flavonol-arabinoside
could not be detected in the seed, and the 78D3
glycosyltransferase mutant did not show any significant
flavonoid changes. Other genes involved in flavonoid
synthesis are located close to the FLA13 locus, such as the
Bsister MADS domain TT16, the glutathione S-transferase
TT19, or chalcone synthase (CHS); however, their modifications
are unlikely to produce such specific variation in
a single flavonol.

HIF analysis around the loci FLA12, 14, 20, and 22
suggests a complex genetic basis for the observed
variation in flavonoids

The QTLs explaining the variation of quercetin-3-O-rhamnoside,
biflavonols, and soluble PA located at the end
of chromosome 5 (FLA12, 14, 20, and 22) could be related to
the LAC15/TT10 gene. TT10 encodes a laccase-like enzyme
involved in oxidation of quercetin-3-O-rhamnoside to biflavonols
and of epicatechin monomer and oligomers to
oxidized procyanidins in the Arabidopsis seed coat (Pourcel
et al., 2005). Indeed, quercetin-3-O-rhamnoside and soluble
PA contents were higher in plants fixed for the Bay-0
allele [see additive effect (a) in Table 1], whereas biflavonols were
more abundant in plants fixed for the Shahdara allele.
However, HIF410, heterozygous around FLA12, only
showed an accumulation of biflavonol with the Shahdara
allele (Table 2; Supplementary Fig. S3E at JXB online).
HIF108 on the lower arm of chromosome 5 displayed higher
soluble PA content (and perhaps quercetin-3-O-rhamnoside),
whereas HIF093 segregated for higher quercetin-3-O-rhamnoside
content with the Bay-0 allele (and possibly less
biflavonols, as observed for HIF410). Together, these results
suggested that the metabolic variations observed for the
FLA12, 14, 20, and 22 loci are probably not explained by
TT10 (LAC15) polymorphism and that biflavonol and PA
variations could be controlled by different loci or are
subjected to complex epistatic interactions.

tt7 and tt15 mutants display the same specific flavonoid
variations predicted at the FLA5/15 and FLA10/18 loci

HIFs fixed for the Cvi-0 or Shahdara alleles at the FLA5
or FLA15 locus contained more kaempferol derivatives
than those fixed for the Col-0 or Bay-0 alleles, respectively,
both in seeds and in leaves (Supplementary Fig. S6 at JXB
online). Around FLA5, several genes belong to the
flavonoid pathway, namely F3'H (TT7), FLS, and CHS.
However, CHS alteration should affect the accumulation
of all flavonoids (Routaboul et al., 2006). Along the same
lines, the selective reduction of kaempferol derivatives
observed both in seeds and in leaves (see Supplementary
Fig. S6 at JXB online) is unlikely to be related to
a modification of the FLS enzyme, which uses both
dihydroquercetin and dihydrokaempferol as substrates for
quercetin and kaempferol production, respectively. A
putative candidate for the FLA5 and FLA15 QTLs was the
F3'H enzyme, which converts dihydrokaempferol into
dihydroquercetin and whose inhibition produces an increase
in dihydrokaempferol and a decrease in quercetin derivatives,
as in the tt7-4 mutant (Routaboul et al., 2006). Finally,
TT15 (DeBolt et al., 2009) is involved in PA accumulation
and the corresponding gene is located near FLA10 and
FLA18. The two tt15 mutant alleles (in the Col-0 and
Ws-4 backgrounds) had reduced amounts of quercetin-3-O-rhamnoside
and PAs (Fig. 6) that could match the
observed variation linked to the FLA10 and FLA18 loci,
respectively.

Discussion 

Large quantitative variations for flavonoids are observed 
in Arabidopsis seed 

The seed flavonoids of 41 accessions grown in controlled
conditions have been analysed to gain a first insight into the
naturally occurring variation in Arabidopsis. They were
chosen among 265 worldwide accessions to maximize
genetic diversity (McKhann et al., 2004). These secondary
metabolites, at first sight, appear to be mostly dispensable
in Arabidopsis, because the CHS mutants (tt4) that lack
flavonoids showed limited adverse effects (Ylstra et al.,
1996; Brown et al., 2001; Buer and Muday, 2004), at least
under laboratory conditions. Nevertheless, all flavonoids,
flavonols, and procyanidins were detected in all the
accessions that were analysed. However, large quantitative
variations were observed for seed flavonoids that were
mainly due to quercetin-3-O-rhamnoside and PAs that
accumulate in the seed coat. For instance, in Cvi-0, the
amount of quercetin-3-O-rhamnoside was ~1% of that
found in the Sp-0 accession. These quantitative variations
were amplified, probably due to transgression, in the two
RIL populations. Finally, the correlations between the
accumulation of different flavonoids observed in accessions
or in the two RIL populations were usually
conserved. These observations confirmed that this metabolism
is highly regulated in Arabidopsis. A notable
exception to these quantitative changes is the set of three new
flavonol-hexoside-rhamnosides found in the Shahdara
accession that are presumably isomers of the known
flavonol-3-O-glucoside-7-O-rhamnoside accumulated in
the other accessions. This result suggested that a novel
and specific glycosyltransferase that catalyses the production
of flavonol-hexoside-rhamnoside isomers is active
in Shahdara but not in Bay-0.
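
The transgression mentioned above (RILs whose trait values fall outside the range spanned by their two parents) can be illustrated with a minimal Python sketch. The function name, the line names, and all trait values are hypothetical; a real analysis would also account for the measurement variability of the parental values.

```python
def transgressive_lines(ril_values, parent_a, parent_b):
    """Return the RILs whose trait value lies outside the parental range.

    ril_values: dict mapping a line name to its trait value.
    parent_a, parent_b: trait values of the two parental accessions.
    """
    low, high = min(parent_a, parent_b), max(parent_a, parent_b)
    return {name: v for name, v in ril_values.items() if v < low or v > high}


# Hypothetical quercetin-3-O-rhamnoside contents (mg per g seed).
rils = {"RIL_001": 0.05, "RIL_002": 0.80, "RIL_003": 2.40, "RIL_004": 1.10}
outside = transgressive_lines(rils, parent_a=0.10, parent_b=1.50)
print(sorted(outside))
```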

A limitation of the chemical analysis using quadrupole
mass spectrometry [rather than time-of-flight (TOF) mass
spectrometry; Keurentjes et al., 2006] is that the characterization is
limited to major UV-detected peaks and their derivatives,
and thus minor compounds may be overlooked. Nevertheless,
in Arabidopsis seedlings, a wider LC-MS untargeted
screening of accumulated metabolites has been
previously performed, revealing six different flavonols
present in the two studied accessions (Ler and Cvi-0).
Comparative analysis of seven oilseed rape genotypes
(Auger et al., 2010), almond (Frison and Sporns, 2002), or
fruit such as apples (Wojdylo et al., 2008), strawberries
(Almeida et al., 2007), or grapes (Mane et al., 2007) also
revealed essentially quantitative rather than qualitative
changes.

Flavonoid accumulation is significantly controlled by 
a limited number of additive loci, of which only one 
seems common to both RIL sets 

Detected QTLs account for 11-61% of the observed phenotypic
variation, suggesting that flavonoid accumulation in
seeds is under the genetic control of a few additive loci,
similarly to anthocyanin content in grape berry (Fournier-
Level et al., 2009). Most loci were validated with two or
more independent HIF lines with consistent phenotypic
variation related to the segregating alleles at a given locus in
different genetic backgrounds. This suggests that epistasis is
usually not decisive in determining seed flavonoid content in
the materials and conditions used here. In contrast, analysis
of isoflavones in soybean seeds revealed QTLs that account
for <5% of allelic differences (Melchinger et al., 1998;
Gutierrez-Gonzalez et al., 2010). In the present analyses,
only one QTL could be common to the two populations.
In seedlings of a Cvi-0×Ler population, a QTL for flavonol
content was also detected at ~90 cM on chromosome 1
that was not detected in the populations examined here
(Keurentjes et al., 2006). This shows that the studied
accessions have retained different genetic variations for
shaping flavonoid accumulation (McMullen et al., 1998).
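
To make the notion of "percentage of phenotypic variation accounted for" concrete, the sketch below computes the R² of a single-marker model as the between-genotype sum of squares divided by the total sum of squares. This is a deliberate simplification and an assumption for illustration only; it is not the QTL mapping procedure used in the study, and the genotype labels and trait values are hypothetical.

```python
from statistics import mean


def variance_explained(trait_by_genotype):
    """Percentage of trait variance explained by genotype at one marker.

    trait_by_genotype: dict mapping an allele label to the list of trait
    values of the lines carrying that allele.
    """
    all_values = [v for values in trait_by_genotype.values() for v in values]
    grand_mean = mean(all_values)
    ss_total = sum((v - grand_mean) ** 2 for v in all_values)
    ss_between = sum(
        len(values) * (mean(values) - grand_mean) ** 2
        for values in trait_by_genotype.values()
    )
    return 100.0 * ss_between / ss_total


# Hypothetical trait values for RILs grouped by their allele at one marker.
example = {"Col": [2.0, 4.0], "Cvi": [6.0, 8.0]}
print(f"{variance_explained(example):.0f}% of the variance explained")
```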

MYB12, TT15, and TT7 genes are candidates for the
control of the observed natural flavonoid variations

In total, three QTLs could be associated with a known
candidate gene: MYB12 (R2R3 domain transcription factor),
TT15 (UDP glucose:sterol-glucosyltransferase), and
TT7 (F3'H, flavonoid-3'-hydroxylase). Further molecular
characterization of these candidates, including quantitative
expression analysis in HIF lines, promoter-GUS reporter
gene analysis, and allelic complementation, will be needed to
assess the mechanisms involved in natural variation.

The most promising candidate for controlling kaempferol
contents (around FLA5 and FLA15) was the F3'H gene,
which encodes the enzyme converting dihydrokaempferol
into dihydroquercetin. Mutations at F3'H led to the
accumulation of kaempferol derivatives (Kerhoas et al.,
2006; Routaboul et al., 2006). The Col-0 accession (compared
with Cvi-0) and the two independent HIF lines fixed for the
Col allele showed a similar decrease in all kaempferol
derivatives, suggesting that the Cvi F3'H allele could be
limiting. The FLA5 locus was mapped between 0.0 Mb and
5.3 Mb, and the FLA15 QTL around marker NGA249 at
2.8 Mb, close to the F3'H gene position (2.5 Mb). In maize,
the pr1 locus was recently characterized and shown to
correspond to an F3'H gene (Sharma et al., 2011). This pr1
locus was detected as a major QTL for the synthesis of
C-glycosyl flavones that have insecticidal activity against
corn earworm (Lee et al., 1998; Cortes-Cruz et al., 2003).
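
The co-localization argument above reduces to simple positional arithmetic, sketched below in Python with the positions quoted in the text (FLA5 interval 0.0-5.3 Mb, marker NGA249 at 2.8 Mb, F3'H at 2.5 Mb). The function names and the 1 Mb window used to call a gene "close" to a marker are assumptions made for illustration, not criteria stated in the study.

```python
def within_interval(position_mb, start_mb, end_mb):
    """True when a gene position falls inside a mapped QTL interval."""
    return start_mb <= position_mb <= end_mb


def distance_mb(position_a_mb, position_b_mb):
    """Absolute physical distance (Mb) between two positions on one chromosome."""
    return abs(position_a_mb - position_b_mb)


F3H_MB = 2.5              # F3'H gene position quoted in the text
FLA5_MB = (0.0, 5.3)      # mapped FLA5 interval quoted in the text
NGA249_MB = 2.8           # marker closest to the FLA15 QTL
WINDOW_MB = 1.0           # hypothetical threshold for "close to the marker"

print("F3'H inside FLA5 interval:", within_interval(F3H_MB, *FLA5_MB))
gap = distance_mb(NGA249_MB, F3H_MB)
print(f"F3'H to NGA249: {gap:.1f} Mb, close: {gap <= WINDOW_MB}")
```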

Most QTLs that have been characterized showed genetic
variation in Myb factors regulating transcription. For instance,
MYB12 in the present study is possibly involved in
the control of flavonol content. The white grape phenotype is
also caused by the insertion of a transposable element in the
promoter of the VvMYBA transcription factor that regulates
a VvUFGT glycosyltransferase needed for anthocyanin
accumulation (Kobayashi et al., 2004; This et al., 2007;
Fournier-Level et al., 2009). Elsewhere, the P1 locus in
maize was governed by two duplicated Myb genes (Zhang
et al., 2003). Additional experiments measuring the level of
expression (rather than metabolites) in leaves detected
PAP1, TTG1, and TTG2 as candidate genes in eQTL studies
(Kliebenstein et al., 2006).

Most of the characterized QTLs may correspond to 
novel functions 

Interestingly, although >60 genes involved in flavonoid
metabolism have already been characterized, most of the
FLA QTLs may correspond to new functions, directly (i.e.
new regulators, transporters, etc.) or indirectly (i.e.
developmental genes or regulatory genes of higher hierarchical
order) involved in this metabolic pathway. This rather
unexpected number of new loci involved in the natural
variation of flavonoids may be due to the fact that QTL
analysis can reveal subtle quantitative and/or additive
changes that have been overlooked in previous visual
screens (Trontin et al., 2011). Co-localization of different
QTLs might also be a first indication that some loci have
a pleiotropic effect, due to a common mechanistic basis.

FLA5 and FLA15 co-localize with FLOWERING LOCUS C (FLC),
which encodes a transcription factor involved in the
repression of flowering (Michaels and Amasino, 1999).
Nevertheless, although HIF157 segregated for both (i.e.
flowering time and flavonoid) phenotypes, HIF216
segregated only for flavonoid variations. This indicated that
the flavonoid and flowering time changes around the FLC
locus have independent genetic bases.

Flavonols and PAs have been proposed to be important
for seed quality (i.e. germination, dormancy, and longevity;
Debeaujon et al., 2000; Thompson et al., 2010). The
variation in flavonoids identified in this study may thus be
indirectly related to previously identified QTLs for seed
quality. CDG3 and CDG6, which account for germination at
low temperature in the dark in the Bay-0×Shahdara
population (Meng et al., 2008), may correspond to FLA4,
17, and 19. DOG4 and 5 (Bentsink et al., 2007, 2010), which
are related to a delay in germination, co-localize with
FLA11/21 and FLA5/15. Other loci (e.g. GW1/SSR2,
OSR1, and GW2) involved in the control of germination
under moderate osmotic and salt stresses co-localize with
FLA12, 17, and 21, respectively (Vallejo et al., 2010). GRS,
an enhancer of abi3-5 that affects seed longevity (Clerkx
et al., 2003), co-localized with FLA19, responsible for
increased PA accumulation in Sha relative to Bay-0. The
flavonoid content of the two RIL sets given in Supplementary
Tables S5 and S6 at JXB online will allow a finer
comparison of the data with previous QTL analyses for the
above flavonoid-related traits or others.

In summary, the metabolic analysis of 41 accessions and 
two RIL populations revealed the broad variation of seed 
flavonoid accumulation in Arabidopsis (and three new 
flavonol derivatives). The characterization of 22 QTLs in 
the two RIL populations dissected the genetic architecture 
underlying this natural variation. Most of the traits are 
controlled by a few additive loci with relatively broad 
effects. Further studies with the genotypes described here 
will be required to confirm candidate loci such as TT7, 
TT15, or MYB12. This work also paves the way for 
identifying novel genes that correspond to the other QTLs. 
More broadly, this study shows the potential of combining
metabolomics and quantitative genetics for the characterization
of new genes and novel markers for crop improvement
that have not been revealed by previous qualitative screens.



Supplementary data 

Supplementary data are available at JXB online.

Figure S1. Natural variation of seed flavonoid content in
five contrasted accessions of Arabidopsis.

Figure S2. Confirmation of the major QTLs of the
recombinant population Cvi-0×Col-0 by comparison of the
phenotypes of heterogeneous inbred families (HIFs).

Figure S3. Confirmation of the major QTLs of the
recombinant population Bay-0×Shahdara by comparison
of the phenotypes of heterogeneous inbred families (HIFs).

Figure S4. Mutations in 72B1 and ANL2, or CPC, cannot
explain natural variation corresponding to QTL FLA16 and
FLA2, respectively.

Figure S5. Three additional glycosylated flavonols in the 
Shahdara genotype. 



Figure S6. QTLs 5, 13, and 15 are also confirmed in 
leaves using HIF lines (HIF223 and 301, HIF157 and 216, 
and HIF157 and 214, respectively). 

Table S1. Flavonoid content (mg g⁻¹) in accessions.

Table S2. Correlations (r and P-values) between the
different flavonoids in selected accessions.

Table S3. Correlations (r and P-values) between the
different flavonoids in selected recombinant inbred lines of
Cvi-0×Col-0.

Table S4. Correlations (r and P-values) between the
different flavonoids in selected recombinant inbred lines of
Bay-0×Shahdara.

Table S5. Flavonoid content in selected Cvi-0×Col-0 RIL
lines.

Table S6. Flavonoid content in selected Bay-0×Shahdara
RIL lines.